PLANT PROTEINS 
DESCRIPTION 

The present invention relates to proteins having
biological activity in plant and animal systems, to
polynucleotides encoding such proteins, to
oligonucleotides for use in identifying and
synthesizing these proteins and polynucleotides, to
vectors and cells containing the polynucleotides in
recombinant form and to plants and animals comprising
these, and to the use of the proteins and polynucleotides
and fragments thereof in the control of plant growth and
plant vulnerability to viruses.

Cell cycle progression is regulated by positive and
negative effectors. Among the latter, the product of the
retinoblastoma susceptibility gene (Rb) controls the
passage of mammalian cells through G1 phase. In mammalian
cells, Rb regulates G1/S transit by inhibiting the
function of the E2F family of transcription factors,
known to interact with sequences in the promoter region
of genes required for cellular DNA replication (see e.g.
Weinberg, R. A. Cell 81, 323 (1995); Nevins, J. R. Science
258, 424 (1992)). DNA tumor viruses that infect animal
cells express oncoproteins that interact with the Rb
protein via an LXCXE motif, disrupting Rb-E2F complexes
and driving cells into S-phase (Weinberg ibid; Ludlow, J.
W. FASEB J. 7, 866 (1993); Moran, E. FASEB J. 7, 880
(1993); Vousden, K. FASEB J. 7, 872 (1993)).

The present inventors have shown that efficient
replication of a plant geminivirus requires the integrity
of an LXCXE amino acid motif in the viral RepA protein
and that RepA can interact with members of the human Rb
family in yeast (Xie, Q., Suarez-Lopez, P. and Gutierrez,
C. EMBO J. 14, 4073 (1995)). The presence of the LXCXE
motif in plant D-type cyclins has also been reported
(Soni, R., Carmichael, J. P., Shah, Z. H. and Murray, J.
A. H. Plant Cell 7, 85-103 (1995)).

The present inventors have now identified
characteristic sequences of plant Rb proteins and
corresponding encoding polynucleotides for the first
time, isolated such a protein and polynucleotide, and
particularly have identified sequences that distinguish
it from known animal Rb protein sequences. The inventors
have determined that a known DNA sequence from maize
encodes a plant Rb protein, which is hereinafter
called ZmRb1. ZmRb1 has been demonstrated by the
inventors to interact in yeasts with RepA, a plant
geminivirus protein containing an LXCXE motif essential for
its function. The inventors have further determined that
geminivirus DNA replication is reduced in plant cells
transfected with plasmids encoding either ZmRb1 or human
p130, a member of the human Rb family.

Significantly, the inventors' work suggests that plant
and animal cells may share fundamentally similar
strategies for growth control, and thus human as well as
plant Rb proteins such as ZmRb1 will be expected to have
utility in, inter alia, plant therapeutics, diagnostics,
growth control or investigations, and many such plant
proteins will have similar utility in animals.

In a first aspect of the present invention there is 
provided the use of retinoblastoma protein in controlling 
the growth of plant cells and/or plant viruses. 
Particularly, the present invention provides control of 
viral infection and/or growth in plant cells wherein the 
virus requires the integrity of an LXCXE amino acid motif 
in one of its proteins, particularly, e.g., in the viral
RepA protein, for normal reproduction. Particular plant
viruses so controlled are Geminiviruses.

A preferred method of control using such proteins
involves applying these to the plant cell, either
directly or by introduction of DNA or RNA encoding for
their expression into the plant cell which it is desired
to treat. By over-expressing the retinoblastoma protein,
or expressing an Rb protein or peptide fragment thereof
that interacts with the LXCXE motif of the virus but does
not affect the normal functioning of the cell, it is
possible to inhibit normal virus growth and thus also to
reduce infection spreading from that cell to its
neighbours.

Alternatively, by introducing anti-sense DNA or RNA
into plant cells in the form of vectors that contain the
promoters necessary for transcription of the DNA or RNA, it
will be possible to exploit the well known anti-sense
mechanism in order to inhibit the expression of the Rb
protein, and thus to influence the S-phase. Such plants will
be of use, among other things, for replicating DNA or RNA to
high levels, e.g. in yeasts. Methods for introducing
anti-sense DNA into cells are very well known to those
skilled in the art: see for example "Principles of Gene
Manipulation - An Introduction to Genetic Engineering"
(1994) R.W. Old & S.B. Primrose; Oxford, Blackwell
Scientific Publications, Fifth Edition, p398.

In a second aspect of the present invention there is
provided recombinant nucleic acid, particularly in the
form of DNA or cRNA (mRNA), encoding for expression of Rb
protein that is characteristic of plants. This nucleic
acid is characterised by one or more characteristic
regions that differ from known animal Rb protein nucleic
acid and is exemplified herein by SEQ ID No 1, bases
31-2079.

The DNA or RNA can have a sequence that contains
degenerate substitutions in the nucleotides of the codons
of SEQ ID No. 1, and in which, for RNA, T is U. The
most preferred DNA or RNA is capable of hybridizing with
the polynucleotide of SEQ ID No. 1 under conditions of
low stringency, the hybridization preferably being
produced under conditions of high stringency.
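The idea of degenerate codon substitution and of reading T as U in RNA can be sketched in a few lines of Python. This is only an illustration: the codon table is deliberately restricted to the codons used here, `ORIGINAL` is the first eight codons of the open reading frame as printed in SEQ ID No 1, and `DEGENERATE` is a hypothetical synonymous variant invented for the example.

```python
# Minimal codon table covering only the codons used in this sketch.
CODON_TABLE = {
    "AUG": "Met", "GAA": "Glu", "GAG": "Glu", "UGU": "Cys", "UGC": "Cys",
    "UUC": "Phe", "UUU": "Phe", "CAG": "Gln", "CAA": "Gln",
    "UCA": "Ser", "AAU": "Asn", "AAC": "Asn", "UUG": "Leu", "CUG": "Leu",
}


def to_rna(dna: str) -> str:
    """Return the RNA form of a DNA coding strand (every T read as U)."""
    return dna.upper().replace("T", "U")


def translate(dna: str) -> list[str]:
    """Translate a DNA coding strand codon by codon using CODON_TABLE."""
    rna = to_rna(dna)
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3)]


# First eight codons of the ORF as printed in SEQ ID No 1 (bases 31-54).
ORIGINAL = "ATGGAATGTTTCCAGTCAAATTTG"
# Hypothetical variant with synonymous (degenerate) codon changes only.
DEGENERATE = "ATGGAGTGCTTTCAATCAAACCTG"

if __name__ == "__main__":
    print(translate(ORIGINAL))
    print(translate(ORIGINAL) == translate(DEGENERATE))
```

Both sequences translate to the same eight amino acids, which is the sense in which a degenerate nucleic acid still encodes the protein of SEQ ID No 2.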

The expressions "conditions of low stringency" and
"conditions of high stringency" are understood by those
skilled in the art, but are conveniently exemplified in
US 5202257, columns 9 to 10. If modifications are made that
lead to the expression of a protein with different amino
acids, these are preferably of the same kind as the
corresponding amino acids encoded by SEQ ID No 1; that is,
they are conservative substitutions. Such substitutions are
known to those skilled in the art, for example see
US 5380712, and are only contemplated when the resulting
protein retains retinoblastoma protein activity.

Preferred DNA or cRNA encodes a plant Rb protein
having A and B pocket sub-domains with between 30% and
75% homology to human Rb protein, particularly as
compared with p130, more preferably from 50% to 64%
homology. Particularly, the plant Rb protein so encoded
has the C706 amino acid of human Rb conserved. Preferably
the spacer sequence between the A and B pockets is not
conserved with respect to animal Rb proteins, preferably
being less than 50% homologous to the same region as
found in such animal proteins. Most preferably the
protein so encoded has 80% or more homology with that of
SEQ ID No 2 of the sequence listing attached hereto, still
more preferably 90% or more and most preferably 95% or
more. Particularly provided is recombinant DNA of SEQ ID
No 1 bases 31 to 2079, or the entire SEQ ID No 1, or
corresponding RNAs, being the maize cDNA clone
encoding ZmRb1 of SEQ ID No 2.
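As a rough illustration of how a homology threshold of this kind might be checked, the sketch below computes percent identity over the aligned, non-gap columns of two pre-aligned sequences. It rests on several assumptions: "homology" is approximated here by strict identity (the text may intend a similarity measure), the alignment is supplied by the caller, and the two toy sequences are hypothetical and are not taken from ZmRb1 or p130.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # gap columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def in_range(value: float, low: float, high: float) -> bool:
    """True when value lies within the inclusive range [low, high]."""
    return low <= value <= high


# Hypothetical pre-aligned toy sequences, for illustration only.
TOY_PLANT = "LFCDE-KLMN"
TOY_HUMAN = "LYCDEAKIMQ"

if __name__ == "__main__":
    pid = percent_identity(TOY_PLANT, TOY_HUMAN)
    print(round(pid, 1), in_range(pid, 30.0, 75.0))
```

The same helper can be pointed at the 50% to 64% sub-range, or at the 80%, 90% and 95% thresholds stated for SEQ ID No 2, by changing the bounds passed to `in_range`.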

In a third aspect of the present invention there is 
provided the protein expressed by the recombinant DNA or 
RNA of the second aspect, novel proteins derived from 
such DNA or RNA, and protein derived from naturally 
occurring DNA or RNA by mutagenic means such as use of 
mutagenic PCR primers. 



In a fourth aspect there are provided vectors, cells 
and plants and animals comprising the recombinant DNA or 
RNA of correct sense or anti-sense, of the invention. 

In a particularly preferred use of the first aspect 
there is provided a method of controlling cell or viral 
growth comprising administering the DNA, RNA or protein 
of the second or third aspects to the cell. Such 
administration may be direct in the case of proteins or 
may involve indirect means, such as electroporation of 
plant seed cells with DNA or by transformation of cells 
with expression vectors capable of expressing or
over-expressing the proteins of the invention or fragments
thereof that are capable of inhibiting cell or viral
growth.

Alternatively, the method uses an expression vector 
capable of producing anti-sense RNA of the cDNA of the 
invention. 

A further specific characteristic of the plant protein
and nucleic acids is an N-terminal domain corresponding in
sequence to amino acids 1 to 90 of SEQ ID No. 2, and a
nucleotide sequence corresponding to bases 31 to 300 of
SEQ ID No. 1. These sequences are characterised by
possessing fewer than 150 amino acids and fewer than 450
bases respectively, unlike the corresponding animal
sequences, which possess more than 300 amino acids and 900
or more base pairs.

The present invention will now be illustrated further 
by reference to the following non-limiting Examples. 
Further embodiments falling within the scope of the 
claims attached hereto will occur to those skilled in the 
light of these. 

Figures. 

Fig. 1. Sub-figure A shows the relative lengths of
the present ZmRb1 protein and the human retinoblastoma
proteins. Sub-figure B shows the alignment of the
amino acid sequences of Pocket A and Pocket B of
ZmRb1 with those of the Xenopus, chicken, rat and three
human proteins (Rb, p107 and p130).

Fig. 2. This figure is a map of the main characteristics
of the WDV virus and the pWori vector derived from WDV,
and the positions of the deletions and mutations used in
order to establish that the LXCXE motif is required for
its replication in plant cells.

EXAMPLE 1.

Isolation of DNA and protein expressing clones.

Total RNA was isolated from maize root and mature
leaves by grinding the material, previously frozen in
liquid nitrogen, essentially as described in Soni et al
(1995). The major and minor p75ZmRb1 mRNAs were
identified by hybridization to a random-primed
32P-labelled PstI internal fragment (1.4 kb).

A portion of a maize cDNA library (10^6 pfu) in λZAPII
(Stratagene) was screened by subsequent hybridization to
5'-labelled oligonucleotides designed to be complementary
to a known maize EST homologous to p130. These
oligonucleotides were 5'-AATAGACACATCGATCAA/G (M.5m, nt
positions 1411-1438) and 5'-GTAATGATACCAACATGG (M.3c, nt
positions 1606-1590) (Isogen Biosciences).
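A short Python sketch can show how a screening oligonucleotide is matched to either strand of a template. The two fragments below are copied from the rows of SEQ ID No 1 that surround the two oligonucleotides; the helper names are hypothetical, offsets are 0-based within each fragment rather than positions in the full cDNA, and the degenerate "A/G" position of M.5m is simply omitted as an assumption of the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate(primer: str, template: str):
    """Return (strand, 0-based offset) of the primer on the template, or None.

    "+" means the primer matches the strand as written; "-" means its
    reverse complement matches, i.e. the primer anneals to that strand.
    """
    primer = primer.upper()
    template = template.upper()
    pos = template.find(primer)
    if pos != -1:
        return ("+", pos)
    pos = template.find(reverse_complement(primer))
    if pos != -1:
        return ("-", pos)
    return None


# Fragments copied from the SEQ ID No 1 rows around each oligonucleotide.
FRAGMENT_5 = "ACATTATTTTTTAATAGACACATCGATCAACTTATCCTTTGCTGTCTT"
FRAGMENT_3 = "GTATTAGTATCGCGCCATGTTGGTATCATTACTTTTTACAATGAG"

M5M = "AATAGACACATCGATCAA"  # M.5m as printed, degenerate A/G position omitted
M3C = "GTAATGATACCAACATGG"  # M.3c as printed

if __name__ == "__main__":
    print(locate(M5M, FRAGMENT_5))
    print(locate(M3C, FRAGMENT_3))
```

In this sketch M.5m matches the printed strand directly, while M.3c is found only as its reverse complement, which is consistent with the descending position range quoted for M.3c.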

After the second round of screening, pBluescript SK-
(pBS) phagemids from positive clones were isolated by in
vivo excision with ExAssist helper phage (Stratagene)
according to protocols recommended by the manufacturer.
DNA sequencing was carried out using a Sequenase(TM) Kit
(USB).

The 5'-end of the mRNAs encoding p75ZmRb1 was
determined by RACE-PCR. Poly-A+ mRNA was purified by
chromatography on oligo-dT-cellulose (Amersham). The
first strand was synthesized using oligonucleotide DraI35
(5'-GATTTAAAATCAAGCTCC, nt positions 113-96). After
denaturation at 90°C for 3 min, RNA was eliminated by
RNase treatment, and the cDNA was recovered and 5'-tailed
with terminal transferase and dATP. Then a PCR fragment was
amplified using primer DraI35 and the linker-primer (50
bp) of the Stratagene cDNA synthesis kit.

One of the positive clones so produced contained a ~4
kb insert that, according to restriction analysis,
extended both 5' and 3' of the region contained in the
Expressed Sequence Tag used. The nucleotide sequence
corresponding to the longest cDNA insert (3747 bp) is
shown in SEQ ID No. 1. This ZmRb1 cDNA contains a single
open reading frame capable of encoding a protein of 683
amino acids (predicted Mr 75247, p75ZmRb1) followed by a
1646 bp 3'-untranslated region. Untranslated regions of
similar length have also been found in mammalian Rb cDNAs
(Lee, W.-L. et al, Science 235, 1394 (1987); Bernards, R.
et al, Proc. Natl. Acad. Sci. USA 86, 6474 (1989)).
Northern analysis indicates that maize cells derived from
both root meristems and mature leaves contain a major
message, ~2.7±0.2 kb in length. In addition, a minor
~3.7±0.2 kb message also appears. Heterogeneous
transcripts have been detected in other species (Destree,
O. H. J. et al, Dev. Biol. 153, 141 (1992)).
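The coordinates quoted for the open reading frame can be cross-checked with simple arithmetic. The Python sketch below uses only numbers stated in the description (coding region at bases 31 to 2079 of SEQ ID No 1, 683 amino acids, N-terminal domain at amino acids 1 to 90); the function names are hypothetical helpers introduced for the illustration.

```python
CDS_START = 31    # first base of the coding region in SEQ ID No 1
CDS_END = 2079    # last base of the coding region in SEQ ID No 1


def codon_span(first_aa: int, last_aa: int, cds_start: int = CDS_START):
    """Return 1-based inclusive base coordinates encoding amino acids first_aa..last_aa."""
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("invalid amino acid range")
    start = cds_start + 3 * (first_aa - 1)
    end = cds_start + 3 * last_aa - 1
    return start, end


def cds_codon_count(cds_start: int = CDS_START, cds_end: int = CDS_END) -> int:
    """Return the number of whole codons in the coding region."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return length // 3


if __name__ == "__main__":
    print(cds_codon_count())   # number of codons between bases 31 and 2079
    print(codon_span(1, 90))   # bases encoding the N-terminal domain
    print(codon_span(1, 683))  # bases encoding the whole protein
```

The 2049 bases between positions 31 and 2079 correspond to exactly 683 codons, and amino acids 1 to 90 map to bases 31 to 300, in agreement with the figures given in the text.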

Plasmid pWoriAA was constructed by deleting from pWori
most of the sequences encoding WDV proteins (Sanz and
Gutierrez, unpublished). Plasmid p35S.Rb1 was constructed
by inserting the CaMV 35S promoter (obtained from
pWDV3:35SGUS) upstream of the ZmRb1 cDNA in the pBS
vector. Plasmid p35S.130 was constructed by introducing
the complete coding sequence of human p130 in place of the
ZmRb1 sequences in p35S.Rb1. Plasmid p35S.A+B was
constructed by substituting sequences encoding the WDV
RepA and RepB ORFs for ZmRb1 in the p35S.Rb1 plasmid.
(See Soni, R. and Murray, J. A. H. Anal. Biochem. 218,
474-476 (1994)).

The sequence around the methionine codon at nucleotide
position 31 contains a consensus translation start
(Kozak, M. J. Mol. Biol. 196, 947 (1987)). To determine
whether the cDNA contained the full-length ZmRb1 coding
region, the 5'-end of the mRNAs was amplified by RACE-PCR
using an oligonucleotide derived from a region close to
the putative initiator AUG, which would produce a
fragment of ~150 bp. The results are consistent with the
ZmRb1 cDNA clone containing the complete coding region.

The ZmRb1 protein contains segments homologous to the
A and B subdomains of the "pocket" that is present in all
members of the Rb family. These subdomains are separated
by a non-conserved spacer. ZmRb1 also contains
non-conserved N-terminal and C-terminal domains. Overall,
ZmRb1 shares ~28-30% amino acid identity (~50%
similarity) with the Rb family members (Hannon, G. J.,
Demetrick, D. & Beach, D. Genes Dev. 7, 2378 (1993);
Cobrinik, D., Whyte, P., Peeper, D. S., Jacks, T. &
Weinberg, R. A. ibid., p. 2392 (1993); Ewen, M. E., Xing,
Y., Lawrence, J. B. and Livingston, D. M. Cell 66, 1155
(1991); Lee, W.-L. et al, Science 235, 1394 (1987);
Bernards et al, Proc. Natl. Acad. Sci. USA 86, 6474
(1989)), with the A and B subdomains exhibiting the
highest homology (~50-64%). Interestingly, amino acid
C706 in human Rb, critical for its function (Kaye, F. J.,
Kratzke, R. A., Gerster, J. L. and Horowitz, J. M. Proc.
Natl. Acad. Sci. USA 87, 6922 (1990)), is also conserved
in maize p75ZmRb1.

Note: The 561-577 amino acids encompass a proline-rich 
domain. 

ZmRb1 contains 16 consensus sites (SP or TP) for
phosphorylation by cyclin-dependent kinases (CDKs), with
one at the N-terminal end of sub-domain A and several in
the C-terminal area, which are potential sites of
phosphorylation. A preferred group of nucleic acids
encodes proteins in which one or more of these sites are
changed or deleted, making the protein more resistant to
phosphorylation and thus to changes in its functionality,
for example binding to E2F or similar. This can be easily
carried out by PCR-mediated mutagenesis.
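Locating SP or TP consensus sites, and modelling a variant in which they are changed, can be sketched as follows. The fragment is the one-letter form of two consecutive rows of SEQ ID No 2 from the proline-rich region; the helper names are hypothetical, positions are 0-based within the fragment, and the replacement of serine or threonine by alanine is an assumption made for the example, since the text only says that sites are "changed or deleted".

```python
import re


def cdk_sites(protein: str) -> list[int]:
    """Return 0-based positions of S or T residues immediately followed by P."""
    return [m.start() for m in re.finditer(r"[ST](?=P)", protein.upper())]


def block_sites(protein: str, positions: list[int]) -> str:
    """Return a copy in which the listed S/T residues are replaced by alanine."""
    residues = list(protein.upper())
    for pos in positions:
        if residues[pos] not in "ST":
            raise ValueError(f"residue {pos} is not S or T")
        residues[pos] = "A"  # assumed substitution; removes the phospho-acceptor
    return "".join(residues)


# One-letter form of two consecutive rows of SEQ ID No 2 (proline-rich region).
FRAGMENT = "KKNASGQIPGSPKPSPFPNLPDMSPKKVSASH"

if __name__ == "__main__":
    sites = cdk_sites(FRAGMENT)
    print(sites)
    print(cdk_sites(block_sites(FRAGMENT, sites)))
```

Applied to the complete 683-residue sequence, the same scan would give the full list of candidate sites from which a subset could be chosen for mutagenic PCR primer design.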

EXAMPLE 2 

In vivo activity. 

Replication of wheat dwarf geminivirus (WDV) is
dependent upon an intact LXCXE motif of the viral RepA
protein. This motif can mediate interaction with a member
of the human Rb family, p130, in yeasts. Therefore, the
inventors investigated whether p75ZmRb1 could complex
with WDV RepA by using the yeast two-hybrid system
(Fields, S. and Song, O. Nature 340, 245-246 (1989)).
Yeast cells were co-transformed with a plasmid encoding
the GAL4BD-RepA fusion protein and with plasmids encoding
different GAL4AD fusion proteins. The GAL4AD-p75ZmRb1
fusion could also complex with GAL4BD-RepA to allow
growth of the recipient yeast cells in the absence of
histidine. This interaction was slightly stronger than
that seen with the human p130 protein. RepA could also
bind to some extent to an N-terminally truncated form of
p75ZmRb1. The role of the LXCXE motif in RepA-p75ZmRb1
interaction was assessed using a point mutation in WDV
RepA (E198K) which we previously showed to destroy
interaction with human p130. Co-transformation of ZmRb1
with a plasmid encoding the fusion GAL4BD-RepA(E198K)
indicated that the interaction between RepA and p75ZmRb1
occurred through the LXCXE motif.

In this respect, the E198K mutant of WDV RepA behaves
similarly to analogous point mutants of animal virus
oncoproteins (Moran, E., Zerler, B., Harrison, T. M. and
Mathews, M. B. Mol. Cell Biol. 6, 3470 (1986); Cherington,
V. et al., ibid., p. 1380 (1988); Lillie, J. W.,
Lowenstein, P. M., Green, M. R. and Green, M. Cell 50,
1091 (1987); DeCaprio, J. A. et al., ibid., p. 275
(1988)).

Specific interaction between maize p75ZmRb1 and WDV
RepA in the yeast two-hybrid system (Fields et al) relied
on the ability to reconstitute a functional GAL4 activity
from two separated GAL4 fusion proteins containing the
DNA binding domain (GAL4BD) and the activation domain
(GAL4AD). Yeast HF7c cells were co-transformed with a
plasmid expressing the GAL4BD-RepA or the
GAL4BD-RepA(E198K) fusions and the plasmids expressing the
GAL4AD alone (Vec) or fused to human p130, maize p75
(p75ZmRb1) or a 69 amino acid N-terminal deletion of p75
(p75ZmRb1-DN). Cells were streaked on plates with or
without histidine according to the distribution shown in
the upper left corner. The ability to grow in the absence
of histidine depends on the functional reconstitution of
a GAL4 activity upon interaction of the fusion proteins,
since this triggers expression of the HIS3 gene which is
under the control of a GAL4 responsive element. The
growth characteristics of these yeast co-transformants
correlate with the levels of β-galactosidase activity.

Procedures for two-hybrid analysis are described in Xie
et al (1995). The GAL4AD-ZmRb1 fusions were constructed in
the pGAD424 vector.
EXAMPLE 3 
In vivo activity. 

Geminivirus DNA replication requires the cellular DNA
replication machinery as well as other S-phase specific
factors (Davies, J. W. and Stanley, J. Trends Genet. 5,
77 (1989); Lazarowitz, S. Crit. Rev. Plant Sci. 11, 327
(1992)). Consistent with this requirement, geminivirus
infection appears to drive non-proliferating cells into
S-phase, as indicated by the accumulation of the
proliferating cell nuclear antigen (PCNA), a protein
which is not normally present in the nuclei of
differentiated cells (Nagar, S., Pedersen, T. J.,
Carrick, K. M., Hanley-Bowdoin, L. and Robertson, D.
Plant Cell 7, 705 (1995)). The inventors' finding that
efficient WDV DNA replication requires an intact LXCXE
motif in RepA, coupled with the discovery of a plant
homolog of Rb, supports the model that, as in animal
cells, sequestration of plant Rb by viral RepA protein
promotes inappropriate entry of infected cells into
S-phase. Therefore, one way to investigate the function of
p75ZmRb1 was to measure geminivirus DNA replication in
cells transfected with a plasmid bearing the ZmRb1
sequences under a promoter functional in plant cells, an
approach analogous to that previously used in human cells
(Uzvolgi, E. et al., Cell Growth Diff 2, 297 (1991)).
Accumulation of newly replicated viral plasmid DNA was
impaired in wheat cells transfected with plasmids
expressing p75ZmRb1 or human p130, when expression of WDV
replication protein(s) is directed either by the WDV
promoter or by the CaMV 35S promoter.

Since WDV DNA replication requires an S-phase cellular
environment, interference with viral DNA replication by
p75ZmRb1 and human p130 strongly evidences a role for
retinoblastoma protein in the control of the G1/S
transition in plants. The existence of a plant Rb homolog
implies that despite their ancient divergence, plant and
animal cells use, at least in part, similar regulatory
proteins and pathways for cell cycle control.

Two lines of evidence reinforce this model. First, a
gene encoding a protein that complements specifically the
G1/S, but not the G2/M, transition of the budding yeast
cdc28 mutant has been identified in alfalfa cells (Hirt,
H., Pay, A., Bögre, L., Meskiene, I. and Heberle-Bors, E.
Plant J. 4, 61 (1993)). Second, plant homologs of D-type
cyclins have been isolated from Arabidopsis and these,
like their mammalian relatives, contain LXCXE motifs. In
concert with plant versions of CDK4 and CDK6, plant
D-type cyclins may regulate passage through G1 phase by
controlling the phosphorylation state of Rb-like
proteins.

In animal cells, the Rb family has been implicated in
tumor suppression and in the control of differentiation
and development. Thus, p75ZmRb1 could also play key
regulatory roles at other levels during the life of the
plant cell. One key question that is raised by the
existence of Rb homologs in plant cells is whether, as in
animals, disruption of the Rb pathway leads to a
tumor-prone condition. In this regard, the inventors have
noted that the VirB4 protein encoded by the Ti plasmids of
both Agrobacterium tumefaciens and A. rhizogenes contains
an LXCXE motif. Although the VirB4 protein is required for
tumor induction (Hooykaas, P. J. J. and Beijersbergen, A.
G. M. Annu. Rev. Phytopathol. 32, 157 (1994)), the
function of its LXCXE motif in this context remains to be
examined. Geminivirus infection is not accompanied by
tumor development in the infected plant, but in some
cases an abnormal growth of enations has been observed
(G. Dafalla and B. Gronenborn, personal communication).

Inhibition of wheat dwarf geminivirus (WDV) DNA
replication by ZmRb1 or human p130 in cultured wheat
cells was carried out as follows.

A. Wheat cells were transfected, as indicated, with pWori
(Xie et al. 1995) alone (0.5 µg), a replicating WDV-based
plasmid which encodes WDV proteins required for viral DNA
replication, and with control plasmid pBS (10 µg) or
p35S.Rb1 (10 µg), which encodes ZmRb1 sequences under the
control of the CaMV 35S promoter. Total DNA was purified
one and two days after transfection, equal amounts were
fractionated in agarose gels and stained with ethidium
bromide, and viral pWori DNA was identified by Southern
hybridization. Plasmid DNA represents exclusively
newly-replicated plasmid DNA since it is fully resistant
to DpnI digestion and sensitive to MboI. Note that the
MboI-digested samples were run for about half the length
of the undigested samples.

B. To test the effect of human p130 on WDV DNA
replication, wheat cells were co-transfected with pWori
(0.5 µg) and plasmids pBS (control), p35S.Rb1 or p35S.130
(10 µg in each case). Replication of the test plasmid
(pWori) was analyzed two days after transfection and was
detected as described in part A using ethidium bromide
staining and Southern hybridization.

C. To test the effect of ZmRb1 or human p130 on WDV DNA
replication when expression of viral proteins was directed
by the CaMV 35S promoter, the test plasmid pWoriAA (which
does not encode functional WDV replication proteins but
replicates when they are provided by a different plasmid,
i.e. pWori) was used. Wheat cells were co-transfected, as
indicated, with pWoriAA (0.25 µg), pWori (0.25 µg),
p35S.A+B (6 µg), p35S.Rb1 (10 µg) and/or p35S.130 (10 µg).
Replication of the test plasmid (pWoriAA) was analyzed 36
hours after transfection and was detected as described in
part A using ethidium bromide staining and Southern
hybridization. Plasmids pWori (M1) and pWoriAA (M2; Sanz
and Gutierrez, unpublished), 100 pg in each case, were
used as markers.

Suspension cultures of wheat cells, transfection by
particle bombardment and analysis of viral DNA
replication were carried out as described in Xie et al.
(1995), except that DNA extraction was modified as in
Soni and Murray, Anal. Biochem. 218, 474-476 (1994).
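The DpnI/MboI test for newly replicated DNA can be modelled in a few lines of Python. The sketch assumes the standard behaviour of the two enzymes (both recognise GATC; DpnI cuts only when the adenine is Dam-methylated, MboI only when it is not) and assumes that the input plasmid was prepared from a Dam-methylating E. coli strain while DNA replicated in plant cells is unmethylated. Hemimethylated intermediates are ignored, and the function names are hypothetical. The fragment is copied from SEQ ID No 1 only to provide a sequence containing a GATC site.

```python
def gatc_sites(seq: str) -> list[int]:
    """Return 0-based start positions of every GATC site in the sequence."""
    seq = seq.upper()
    sites = []
    pos = seq.find("GATC")
    while pos != -1:
        sites.append(pos)
        pos = seq.find("GATC", pos + 1)
    return sites


def is_cut(enzyme: str, dam_methylated: bool, seq: str) -> bool:
    """Simplified model: DpnI needs methylated GATC, MboI needs unmethylated GATC."""
    if not gatc_sites(seq):
        return False  # neither enzyme has a recognition site
    if enzyme == "DpnI":
        return dam_methylated
    if enzyme == "MboI":
        return not dam_methylated
    raise ValueError(f"unknown enzyme: {enzyme}")


def is_newly_replicated(dpni_cuts: bool, mboi_cuts: bool) -> bool:
    """DNA resistant to DpnI and sensitive to MboI is scored as newly replicated."""
    return (not dpni_cuts) and mboi_cuts


# Fragment copied from SEQ ID No 1, used here only because it contains GATC.
FRAGMENT = "AATAGACACATCGATCAACTT"

if __name__ == "__main__":
    for label, methylated in (("input plasmid", True), ("replicated in plant cells", False)):
        dpni = is_cut("DpnI", methylated, FRAGMENT)
        mboi = is_cut("MboI", methylated, FRAGMENT)
        print(label, dpni, mboi, is_newly_replicated(dpni, mboi))
```

Under these assumptions only the unmethylated DNA is scored as newly replicated, which is the reasoning behind the statement in part A about DpnI resistance and MboI sensitivity.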

SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: CRISANTO GUTIERREZ ARMENTA
(A) NAME: QI XIE
(A) NAME: ANDRES PELAYO SANZ-BURGOS
(A) NAME: PAULA SUAREZ-LOPEZ
(B) STREET: CSIC-UAM, UNIVERSIDAD AUTONOMA, CANTOBLANCO
(C) CITY: MADRID
(E) COUNTRY: SPAIN
(F) POSTAL CODE (ZIP): 28049

(ii) TITLE OF THE INVENTION: PLANT PROTEINS
(iii) NUMBER OF SEQUENCES: 2
(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3747 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Zea mays
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 31..2079
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1

GAATTCGGCA CGAGCAAAGG TCTGATTGAT ATG GAA TGT TTC CAG TCA AAT TTG
Met Glu Cys Phe Gln Ser Asn Leu
1 5

GAA AAA ATG GAG AAA CTA TGT AAT TCT AAT AGC TGT AAA GGG GAG CTT
Glu Lys Met Glu Lys Leu Cys Asn Ser Asn Ser Cys Lys Gly Glu Leu
10 15 20

GAT TTT AAA TCA ATT TTG ATC AAT AAT GAT TAT ATT CCC TAT GAT GAG
Asp Phe Lys Ser Ile Leu Ile Asn Asn Asp Tyr Ile Pro Tyr Asp Glu
25 30 35 40

AAC TCG ACG GGG GAT TCC ACC AAT TTA GGA CAT TCA AAG TGT GCC TTT
Asn Ser Thr Gly Asp Ser Thr Asn Leu Gly His Ser Lys Cys Ala Phe
45 50 55

GAA ACA TTG GCA TCT CCC ACA AAG ACA ATA AAG AAC ATG CTG ACT GTT
Glu Thr Leu Ala Ser Pro Thr Lys Thr Ile Lys Asn Met Leu Thr Val
60 65 70

CCT AGT TCT CCT TTG TCA CCA GCC ACC GGT GGT TCA GTC AAG ATT GTG
Pro Ser Ser Pro Leu Ser Pro Ala Thr Gly Gly Ser Val Lys Ile Val
75 80 85

CAA ATG ACA CCA GTA ACT TCT GCC ATG ACG ACA GCT AAG TGG CTT CGT
Gln Met Thr Pro Val Thr Ser Ala Met Thr Thr Ala Lys Trp Leu Arg
90 95 100

GAG GTG ATA TCT TCA TTG CCA GAT AAG CCT TCA TCT AAG CTT CAG CAG
Glu Val Ile Ser Ser Leu Pro Asp Lys Pro Ser Ser Lys Leu Gln Gln
105 110 115 120

TTT CTG TCA TCA TGC GAT AGG GAT TTG ACA AAT GCT GTC ACA GAA AGG
Phe Leu Ser Ser Cys Asp Arg Asp Leu Thr Asn Ala Val Thr Glu Arg
125 130 135

GTC AGC ATA GTT TTG GAA GCA ATT TTT CCA ACC AAA TCT TCT GCC AAT
Val Ser Ile Val Leu Glu Ala Ile Phe Pro Thr Lys Ser Ser Ala Asn
140 145 150

CGG GGT GTA TCG TTA GGT CTC AAT TGT GCA AAT GCC TTT GAC ATT CCG
Arg Gly Val Ser Leu Gly Leu Asn Cys Ala Asn Ala Phe Asp Ile Pro
155 160 165

TGG GCA GAA GCC AGA AAA GTG GAG GCT TCC AAG TTG TAC TAT AGG GTA
Trp Ala Glu Ala Arg Lys Val Glu Ala Ser Lys Leu Tyr Tyr Arg Val
170 175 180

TTA GAG GCA ATC TGC AGA GCG GAG TTA CAA AAC AGC AAT GTA AAT AAT
Leu Glu Ala Ile Cys Arg Ala Glu Leu Gln Asn Ser Asn Val Asn Asn
185 190 195 200

CTA ACT CCA TTG CTG TCA AAT GAG CGT TTC CAC CGA TGT TTG ATT GCA
Leu Thr Pro Leu Leu Ser Asn Glu Arg Phe His Arg Cys Leu Ile Ala
205 210 215

TGT TCA GCG GAC TTA GTA TTG GCG ACA CAT AAG ACA GTC ATC ATG ATG
Cys Ser Ala Asp Leu Val Leu Ala Thr His Lys Thr Val Ile Met Met
220 225 230

TTT CCT GCT GTT CTT GAG AGT ACC GGT CTA ACT GCA TTT GAT TTG AGC
Phe Pro Ala Val Leu Glu Ser Thr Gly Leu Thr Ala Phe Asp Leu Ser
235 240 245

AAA ATA ATT GAG AAC TTT GTG AGA CAT GAA GAG ACC CTC CCA AGA GAA
Lys Ile Ile Glu Asn Phe Val Arg His Glu Glu Thr Leu Pro Arg Glu
250 255 260

TTG AAA AGG CAC CTA AAT TCC TTA GAA GAA CAG CTT TTG GAA AGC ATG
Leu Lys Arg His Leu Asn Ser Leu Glu Glu Gln Leu Leu Glu Ser Met
265 270 275 280

GCA TGG GAG AAA GGT TCA TCA TTG TAT AAC TCA CTG ATT GTT GCC AGG
Ala Trp Glu Lys Gly Ser Ser Leu Tyr Asn Ser Leu Ile Val Ala Arg
285 290 295

CCA TCT GTT GCT TCA GAA ATA AAC CGC CTT GGT CTT TTG GCT GAA CCA
Pro Ser Val Ala Ser Glu Ile Asn Arg Leu Gly Leu Leu Ala Glu Pro
300 305 310

ATG CCA TCT CTT GAT GAC TTA GTG TCA AGG CAG AAT GTT CGT ATC GAG
Met Pro Ser Leu Asp Asp Leu Val Ser Arg Gln Asn Val Arg Ile Glu
315 320 325

GGC TTG CCT GCT ACA CCA TCT AAA AAA CGT GCT GCT GGT CCA GAT GAC
Gly Leu Pro Ala Thr Pro Ser Lys Lys Arg Ala Ala Gly Pro Asp Asp
330 335 340

AAC GCT GAT CCT CGA TCA CCA AAG AGA TCG TGC AAT GAA TCT AGG AAC
Asn Ala Asp Pro Arg Ser Pro Lys Arg Ser Cys Asn Glu Ser Arg Asn
345 350 355 360

ACA GTA GTA GAG CGC AAT TTG CAG ACA CCT CCA CCC AAG CAA AGC CAC
Thr Val Val Glu Arg Asn Leu Gln Thr Pro Pro Pro Lys Gln Ser His
365 370 375

ATG GTG TCA ACT AGT TTG AAA GCA AAA TGC CAT CCA CTC CAG TCC ACA
Met Val Ser Thr Ser Leu Lys Ala Lys Cys His Pro Leu Gln Ser Thr
380 385 390

TTT GCA AGT CCA ACT GTC TGT AAT CCT GTT GGT GGG AAT GAA AAA TGT
Phe Ala Ser Pro Thr Val Cys Asn Pro Val Gly Gly Asn Glu Lys Cys
395 400 405

GCT GAC GTG ACA ATT CAT ATA TTC TTT TCC AAG ATT CTG AAG TTG GCT 1302
Ala Asp Val Thr Ile His Ile Phe Phe Ser Lys Ile Leu Lys Leu Ala
410 415 420

GCT ATT AGA ATA AGA AAC TTG TGC GAA AGG GTT CAA TGT GTG GAA CAG 1350
Ala Ile Arg Ile Arg Asn Leu Cys Glu Arg Val Gln Cys Val Glu Gln
425 430 435 440

ACA GAG CGT GTC TAT AAT GTC TTC AAG CAG ATT CTT GAG CAA CAG ACA 1398
Thr Glu Arg Val Tyr Asn Val Phe Lys Gln Ile Leu Glu Gln Gln Thr
445 450 455

ACA TTA TTT TTT AAT AGA CAC ATC GAT CAA CTT ATC CTT TGC TGT CTT 1446
Thr Leu Phe Phe Asn Arg His Ile Asp Gln Leu Ile Leu Cys Cys Leu
460 465 470

TAT GGT GTT GCA AAG GTT TGT CAA TTA GAA CTC ACA TTC AGG GAG ATA 1494
Tyr Gly Val Ala Lys Val Cys Gln Leu Glu Leu Thr Phe Arg Glu Ile
475 480 485

CTC AAC AAT TAC AAA AGA GAA GCA CAA TGC AAG CCA GAA GTT TTT TCA 1542
Leu Asn Asn Tyr Lys Arg Glu Ala Gln Cys Lys Pro Glu Val Phe Ser
490 495 500

AGT ATC TAT ATT GGG AGT ACG AAC CGT AAT GGG GTA TTA GTA TCG CGC 1590
Ser Ile Tyr Ile Gly Ser Thr Asn Arg Asn Gly Val Leu Val Ser Arg
505 510 515 520

CAT GTT GGT ATC ATT ACT TTT TAC AAT GAG GTA TTT GTT CCA GCA GCG 1638
His Val Gly Ile Ile Thr Phe Tyr Asn Glu Val Phe Val Pro Ala Ala
525 530 535

AAG CCT TTC CTG GTG TCA CTA ATA TCA TCT GGT ACT CAT CCA GAA GAC 1686
Lys Pro Phe Leu Val Ser Leu Ile Ser Ser Gly Thr His Pro Glu Asp
540 545 550

AAG AAG AAT GCT AGT GGC CAA ATT CCT GGA TCA CCC AAG CCA TCT CCT 1734
Lys Lys Asn Ala Ser Gly Gln Ile Pro Gly Ser Pro Lys Pro Ser Pro
555 560 565

TTC CCA AAT TTA CCA GAT ATG TCC CCG AAG AAA GTT TCA GCA TCT CAT 1782
Phe Pro Asn Leu Pro Asp Met Ser Pro Lys Lys Val Ser Ala Ser His
570 575 580

AAT GTA TAT GTG TCT CCT TTG CGG CAA ACC AAG TTG GAT CTA CTG CTG 13 3 0 

Asn Val Tyr Val Ser Pro Leu Arg Gin Thr Lys Leu Asp Leu Leu Leu 
535 590 595 600 
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HI 



TCA CCA AGT TCC AGG AGT TTT TAT GCA TGC ATT GGT GAA GGC ACC CAT 1878 
Ser Pro Ser Ser Arg Ser Phe Tyr Ala Cys Ile Gly Glu Gly Thr His 
605 610 615 

GCT TAT CAG AGC CCA TCT AAG GAT TTG GCT GCT ATA AAT AGC CGC CTA 1926 
Ala Tyr Gln Ser Pro Ser Lys Asp Leu Ala Ala Ile Asn Ser Arg Leu 
620 625 630 

AAT TAT AAT GGC AGG AAA GTA AAC AGT CGA TTA AAT TTC GAC ATG GTG 1974 
Asn Tyr Asn Gly Arg Lys Val Asn Ser Arg Leu Asn Phe Asp Met Val 
635 640 645 

AGT GAC TCA GTG GTA GCC GGC AGT CTG GGC CAG ATA AAT GGT GGT TCT 2022 
Ser Asp Ser Val Val Ala Gly Ser Leu Gly Gln Ile Asn Gly Gly Ser 
650 655 660 

ACC TCG GAT CCT GCA GCT GCA TTT AGC CCC CTT TCA AAG AAG AGA GAG 2070 
Thr Ser Asp Pro Ala Ala Ala Phe Ser Pro Leu Ser Lys Lys Arg Glu 
665 670 675 680 

ACA GAT ACT TGATCAATTA TAAATGGTGG CCTCTCTCGT ATATAGCTCA 2119 
Thr Asp Thr 

CAGATCCGTG CTCCGTAGCA GTCTATTCTT CTGAATAAGT GGATTAACTG GAGCGATTTA 2179 
ACTGTACATG TATGTGTTAG TGAGAAGCAG CAGTTTTTAG GCAGCAAACT GTTTCAAGTT 2239 
AGCTTTTGAG CTATCACCAT TTCTCTGCTG ATTGAACATA TCCGCTGTGT AGAGTGCTAA 2299 
TGAATCTTTA GTTTTCATTG GGCTGACATA ACAAATCTTT ATCCTAGTTG GCTGGTTGTT 2359 
GGGAGGCATT CATCAGGGTT ATATTTGGTT GTCAAAAAGT ACTGTACTTA ATTCACATCT 2419 
TTCACATTTT TCACTAGCAA TAGCAGCCCC AAATTGCTTT CCTGACTAGG AACATATTCT 2479 
TTACAGGTAT AAGCATGCCA ACTCTAAACT ATATGAATCC TTTTTATATT CTCATTTTTA 2539 
AGTACTTCTC TGTTTCTGCT ACTTTTGTAC TGTATATTTC CAGCTTCTCC ATCAGACTGA 2599 
TGATCCCATA TTCAGTGTGC TGCAAGTGAT TTGACCATAT GTGGCTTATC CTTCAGGTAT 2659 
GTCTCATGTT GTGACTTCAT TGCTGATTGC TTTTGTAATG GTACTGTTGA GTTCATTTCT 2719 
GGTTACAATC AGCCTTTACT GCTTTATATT GTTCTACTAA TTTTGGCTTG CACAGCCAGG 2779 
ACGATTGGTT TTCTGCATCA ATCAATCTTT TTTAGGACAA GATATTTTTG TATGCTACAC 2839 
TTCCCAAATT GCAATTAATC CAGAAGTCTA CCTTGTTTTA TTCTATTAGT TCTCAGCAAC 2899 
AGTGAATGAA TATGAATCAG TCATGCTGAT AGATGTTCAT CTGGTTATTC CAAACAATCT 2959 
GACATCGCAT CTCTTTCTGC AAGTGAGATG AAGAAAACCT GAAATGCTAT CACCATTTAA 3019 

AACATTGGCT TCTGGAAGTT CAGGTGATTA GCAGGAGACG TTCTGACATT GCCATTGACA 3079 

TGTACGGTAG TGATGGCAGG AGACGTTCTT AAACAGCAGC TGCTCCTTCA GCTTGTAATG 3139 

TCTGATTGTA TTGACCAAGA GCATCCACCT TGCCTTATGG TACTAACTGA ATGAGCTGGT 3199 

GACGCTGACT CATCTGCATA ATGGCAGATG CTTAACCATC TTTAGGAGCT CATGTCATGA 3259 

TTCCAGCTGC ACCGTGTCAA ATGTGAAGGC CCTGCAAGGC TTTCCAGGCC GCACCAATCC 3319 

TGCTTGCTTC TTGAAGATAC ATATGGTGCC ACCTAAATAA AAGCTGTTTC TGGTTATGTC 3379 

TGTCCTTGAC ATGTCAACAG ATTAGTGTTG GGTTGCAGTC ATGTGGTGTT TAAGTCTTGG 3439 

AGAAGGCGAG AAGTCATTGC TGCCAGCATT GTGATCGTCA GGCACAGAAG TACTCAAAAG 3499 

TGAGAGCTAC TTGTTGCGAG CAAACGGAGG GCGATATAGG TTGATAGCCA ATTTCAGTTC 3559 

TCTATATACA AGCAGCGGAT TTTGTTTAGA GTTAGCTTTT GAGATGCATC ATTTCTTTCA 3619 

CATCTGATTC TGTGTGTTGT AACTCGGAGT CGCGTAGAAG TTAGAATGCT AACTGACCTT 3679 

AATTTTCACC GAATAATTTG CTAGCGTTTT TCAGTATGAA ATCCTTGTCT TAAAAAAAAA 3739 

AAAAAAAA 3747 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 683 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Glu Cys Phe Gln Ser Asn Leu Glu Lys Met Glu Lys Leu Cys Asn 
1 5 10 15 

Ser Asn Ser Cys Lys Gly Glu Leu Asp Phe Lys Ser Ile Leu Ile Asn 
20 25 30 

Asn Asp Tyr Ile Pro Tyr Asp Glu Asn Ser Thr Gly Asp Ser Thr Asn 
35 40 45 

Leu Gly His Ser Lys Cys Ala Phe Glu Thr Leu Ala Ser Pro Thr Lys 
50 55 60 

Thr Ile Lys Asn Met Leu Thr Val Pro Ser Ser Pro Leu Ser Pro Ala 
65 70 75 80 

Thr Gly Gly Ser Val Lys Ile Val Gln Met Thr Pro Val Thr Ser Ala 
85 90 95 

Met Thr Thr Ala Lys Trp Leu Arg Glu Val Ile Ser Ser Leu Pro Asp 
100 105 110 

Lys Pro Ser Ser Lys Leu Gln Gln Phe Leu Ser Ser Cys Asp Arg Asp 
115 120 125 

Leu Thr Asn Ala Val Thr Glu Arg Val Ser Ile Val Leu Glu Ala Ile 
130 135 140 

Phe Pro Thr Lys Ser Ser Ala Asn Arg Gly Val Ser Leu Gly Leu Asn 
145 150 155 160 

Cys Ala Asn Ala Phe Asp Ile Pro Trp Ala Glu Ala Arg Lys Val Glu 
165 170 175 

Ala Ser Lys Leu Tyr Tyr Arg Val Leu Glu Ala Ile Cys Arg Ala Glu 
180 185 190 

Leu Gln Asn Ser Asn Val Asn Asn Leu Thr Pro Leu Leu Ser Asn Glu 
195 200 205 

Arg Phe His Arg Cys Leu Ile Ala Cys Ser Ala Asp Leu Val Leu Ala 
210 215 220 

Thr His Lys Thr Val Ile Met Met Phe Pro Ala Val Leu Glu Ser Thr 
225 230 235 240 

Gly Leu Thr Ala Phe Asp Leu Ser Lys Ile Ile Glu Asn Phe Val Arg 
245 250 255 

His Glu Glu Thr Leu Pro Arg Glu Leu Lys Arg His Leu Asn Ser Leu 
260 265 270 

Glu Glu Gln Leu Leu Glu Ser Met Ala Trp Glu Lys Gly Ser Ser Leu 
275 280 285 

Tyr Asn Ser Leu Ile Val Ala Arg Pro Ser Val Ala Ser Glu Ile Asn 
290 295 300 

Arg Leu Gly Leu Leu Ala Glu Pro Met Pro Ser Leu Asp Asp Leu Val 
305 310 315 320 



Ser Arg Gln Asn Val Arg Ile Glu Gly Leu Pro Ala Thr Pro Ser Lys 
325 330 335 

Lys Arg Ala Ala Gly Pro Asp Asp Asn Ala Asp Pro Arg Ser Pro Lys 
340 345 350 

Arg Ser Cys Asn Glu Ser Arg Asn Thr Val Val Glu Arg Asn Leu Gln 
355 360 365 

Thr Pro Pro Pro Lys Gln Ser His Met Val Ser Thr Ser Leu Lys Ala 
370 375 380 

Lys Cys His Pro Leu Gln Ser Thr Phe Ala Ser Pro Thr Val Cys Asn 
385 390 395 400 

Pro Val Gly Gly Asn Glu Lys Cys Ala Asp Val Thr Ile His Ile Phe 
405 410 415 

Phe Ser Lys Ile Leu Lys Leu Ala Ala Ile Arg Ile Arg Asn Leu Cys 
420 425 430 

Glu Arg Val Gln Cys Val Glu Gln Thr Glu Arg Val Tyr Asn Val Phe 
435 440 445 

Lys Gln Ile Leu Glu Gln Gln Thr Thr Leu Phe Phe Asn Arg His Ile 
450 455 460 

Asp Gln Leu Ile Leu Cys Cys Leu Tyr Gly Val Ala Lys Val Cys Gln 
465 470 475 480 

Leu Glu Leu Thr Phe Arg Glu Ile Leu Asn Asn Tyr Lys Arg Glu Ala 
485 490 495 

Gln Cys Lys Pro Glu Val Phe Ser Ser Ile Tyr Ile Gly Ser Thr Asn 
500 505 510 

Arg Asn Gly Val Leu Val Ser Arg His Val Gly Ile Ile Thr Phe Tyr 
515 520 525 

Asn Glu Val Phe Val Pro Ala Ala Lys Pro Phe Leu Val Ser Leu Ile 
530 535 540 

Ser Ser Gly Thr His Pro Glu Asp Lys Lys Asn Ala Ser Gly Gln Ile 
545 550 555 560 

Pro Gly Ser Pro Lys Pro Ser Pro Phe Pro Asn Leu Pro Asp Met Ser 
565 570 575 

Pro Lys Lys Val Ser Ala Ser His Asn Val Tyr Val Ser Pro Leu Arg 
580 585 590 

Gln Thr Lys Leu Asp Leu Leu Leu Ser Pro Ser Ser Arg Ser Phe Tyr 
595 600 605 

Ala Cys Ile Gly Glu Gly Thr His Ala Tyr Gln Ser Pro Ser Lys Asp 
610 615 620 

Leu Ala Ala Ile Asn Ser Arg Leu Asn Tyr Asn Gly Arg Lys Val Asn 
625 630 635 640 

Ser Arg Leu Asn Phe Asp Met Val Ser Asp Ser Val Val Ala Gly Ser 
645 650 655 

Leu Gly Gln Ile Asn Gly Gly Ser Thr Ser Asp Pro Ala Ala Ala Phe 
660 665 670 

Ser Pro Leu Ser Lys Lys Arg Glu Thr Asp Thr 
675 680 

INDICATION REGARDING THE DEPOSIT OF A MICRO-ORGANISM 



The micro-organism referred to on page 7 of the description has been deposited in the 
following institution: 

COLECCION ESPANOLA DE CULTIVOS TIPO (CECT) 
Departamento de Microbiologia 
Facultad de Ciencias Biologicas 
46100 BURJASOT (Valencia) 
Spain 

Identification of the Micro-organism deposited: pBS.Rbl 

Date of Deposit: 12 June 1996 

Order number: 4699 

These indications are reflected on form PCT/RO/134, enclosed with the request. 



